Automating data collection with mobile robots promises to improve the efficacy of environmental surveys, but it requires the system to autonomously determine how to sample the environment while avoiding obstacles. Existing approaches, such as the Boustrophedon decomposition algorithm, can completely cover an environment at a specified resolution, but in many cases sampling at the resolution of the underlying distribution produces long paths with an intractable number of measurements. Shortening these paths can yield feasible plans, at the cost of accuracy in estimating the distribution. This work explores the trade-off between distribution accuracy and path length for the Boustrophedon decomposition algorithm. We quantify algorithm performance by computing metrics for accuracy and path length in Monte Carlo simulations over environmental distributions. We highlight when one objective should be prioritized over the other, and propose a modification to the algorithm that improves its effectiveness by sampling more uniformly. These results demonstrate how intelligent deployment of the Boustrophedon algorithm can effectively guide autonomous environmental sampling.
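A minimal sketch of the coverage-versus-path-length trade-off, assuming a rectangular, obstacle-free workspace (the dimensions and resolutions are illustrative, not from the paper): a boustrophedon sweep at a finer sampling resolution rapidly increases both the number of measurements and the path length.

```python
import numpy as np

def boustrophedon_path(width, height, resolution):
    """Back-and-forth ("lawnmower") sweep over a rectangle at a fixed resolution."""
    xs = np.arange(0.0, width + 1e-9, resolution)
    path = []
    for i, x in enumerate(xs):
        ys = np.arange(0.0, height + 1e-9, resolution)
        if i % 2:                      # reverse every other column for a continuous sweep
            ys = ys[::-1]
        path.extend((x, y) for y in ys)
    return np.array(path)

def path_length(path):
    return float(np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1)))

for res in (2.0, 1.0, 0.5):
    p = boustrophedon_path(width=10.0, height=10.0, resolution=res)
    print(f"resolution {res}: {len(p):4d} samples, path length {path_length(p):7.1f}")
```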
Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods. In light of these results, it remains unclear to practitioners how to use these methods and choose between them in a principled way. In this paper, we show that for even moderately rich model classes (easily satisfied by neural networks), any feature attribution method that is complete and linear--for example, Integrated Gradients and SHAP--can provably fail to improve on random guessing for inferring model behaviour. Our results apply to common end-tasks such as identifying local model behaviour, spurious feature identification, and algorithmic recourse. One takeaway from our work is the importance of concretely defining end-tasks. In particular, we show that once such an end-task is defined, a simple and direct approach of repeated model evaluations can outperform many other complex feature attribution methods.
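As a concrete reference for the class of methods discussed, below is a minimal sketch of Integrated Gradients (one of the complete-and-linear attribution methods named above), using finite-difference gradients in place of autodiff on a toy scalar model; the model and inputs are illustrative, not from the paper.

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=64, eps=1e-4):
    # Attribution_i = (x_i - baseline_i) * integral_0^1 df/dx_i(baseline + a*(x - baseline)) da,
    # with the path integral approximated by a midpoint Riemann sum and the gradients
    # by central finite differences (a stand-in for autodiff).
    x, baseline = np.asarray(x, float), np.asarray(baseline, float)
    total = np.zeros_like(x)
    for a in (np.arange(steps) + 0.5) / steps:
        point = baseline + a * (x - baseline)
        grad = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x)
            e[i] = eps
            grad[i] = (f(point + e) - f(point - e)) / (2 * eps)
        total += grad
    return (x - baseline) * total / steps   # completeness: sums to f(x) - f(baseline)

# Toy model: the attributions sum to f(x) - f(baseline), yet say nothing about model
# behaviour away from the straight-line path between baseline and x.
f = lambda z: float(z[0] * z[1] + np.sin(z[2]))
print(integrated_gradients(f, x=[1.0, 2.0, 0.5], baseline=[0.0, 0.0, 0.0]))
```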
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high performance and energy efficiency. As systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g., datapath, memory blocks in different hierarchies, interconnects, compiler optimization, etc.) quickly results in a combinatorial explosion of the design space. This makes domain-specific customization an extremely challenging task. Prior work explores using reinforcement learning (RL) and other optimization methods to automatically explore the large design space. However, these methods have traditionally relied on single-agent RL/ML formulations. It is unclear how scalable single-agent formulations are as we increase the complexity of the design space (e.g., full-stack System-on-Chip design). Therefore, we propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem. The key idea behind using MARL is the observation that parameters across different sub-systems are more or less independent, allowing a decentralized role to be assigned to each agent. We test this hypothesis by designing a domain-specific DRAM memory controller for several workload traces. Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines such as Proximal Policy Optimization and Soft Actor-Critic over different target objectives such as low power and latency. In this way, this work opens a pathway to new and promising research in MARL solutions for hardware architecture search.
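Below is a minimal sketch of the decentralized idea only, not the paper's PPO/SAC-based MARL setup: one independent epsilon-greedy agent per hypothetical memory-controller knob, learning against a shared stand-in reward in place of a cycle-accurate simulator. All parameter names and reward values are invented for illustration.

```python
import random

PARAM_SPACE = {                      # hypothetical DRAM-controller knobs
    "page_policy":  ["open", "closed"],
    "queue_depth":  [16, 32, 64],
    "refresh_rate": ["1x", "2x", "4x"],
}

def reward(config):                  # stand-in for a cycle-accurate simulator
    score = 1.0 if config["page_policy"] == "open" else 0.0
    score += {16: 0.2, 32: 0.8, 64: 0.5}[config["queue_depth"]]
    score += {"1x": 0.6, "2x": 0.4, "4x": 0.1}[config["refresh_rate"]]
    return score + random.gauss(0, 0.05)       # noisy evaluation

# One epsilon-greedy bandit per parameter: the "agent per sub-system" decomposition.
q = {p: {v: 0.0 for v in vals} for p, vals in PARAM_SPACE.items()}
n = {p: {v: 0   for v in vals} for p, vals in PARAM_SPACE.items()}

for step in range(500):
    config = {}
    for p, vals in PARAM_SPACE.items():        # each agent acts only on its own knob
        if random.random() < 0.1:
            config[p] = random.choice(vals)
        else:
            config[p] = max(vals, key=lambda v: q[p][v])
    r = reward(config)                         # shared reward signal
    for p, v in config.items():                # independent value updates per agent
        n[p][v] += 1
        q[p][v] += (r - q[p][v]) / n[p][v]

print({p: max(vals, key=lambda v: q[p][v]) for p, vals in PARAM_SPACE.items()})
```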
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
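A minimal usage sketch for the publicly released checkpoints via the Hugging Face `transformers` API; the model id below (a smaller BLOOM variant, assumed here to keep the example lightweight) and the prompt are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"   # assumed smaller sibling of the full 176B "bigscience/bloom"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)   # downloads weights on first use

inputs = tokenizer("BLOOM is a 176B-parameter open-access language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```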
Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. DP-SGD applies isotropic Gaussian noise to the gradients during training, which can perturb them in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable. In this paper we apply \textit{directional privacy}, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of \textit{angular distance} so that gradient direction is broadly preserved. We show that this provides $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism; and that experimentally, on key datasets, the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off.
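A minimal sketch of direction-preserving noise in the spirit described above, not the paper's calibrated $\epsilon d$-private mechanism: the gradient's direction is resampled from a von Mises-Fisher distribution centred on it while its magnitude is kept. The `kappa` value is an illustrative concentration, not a privacy-calibrated one, and `scipy.stats.vonmises_fisher` requires SciPy >= 1.11.

```python
import numpy as np
from scipy.stats import vonmises_fisher   # SciPy >= 1.11

def vmf_perturb(grad, kappa):
    """Resample the gradient's direction from a VMF centred on it; keep its norm."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return grad
    mu = grad / norm                                   # mean direction = true gradient
    noisy_dir = np.asarray(vonmises_fisher(mu, kappa).rvs()).reshape(-1)
    return norm * noisy_dir

g = np.array([0.3, -1.2, 0.7])
g_priv = vmf_perturb(g, kappa=100.0)                   # larger kappa = smaller angular perturbation
cos = g @ g_priv / (np.linalg.norm(g) * np.linalg.norm(g_priv))
print(g_priv, "angular deviation (deg):",
      round(float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))), 1))
```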
This paper addresses the problem of inverse reinforcement learning (IRL): inferring an agent's reward function from observations of its behavior. IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring a person's preferences in order to assist them. However, effective IRL is challenging, because many reward functions are compatible with the observed behavior. We focus on how prior reinforcement learning (RL) experience can be leveraged to make learning these preferences faster and more efficient. We propose the IRL algorithm BASIS (Behavior Acquisition through Successor-feature Intention inference from Samples), which leverages multi-task RL pre-training and successor features so that an agent can build a strong basis for intentions spanning the space of possible goals in a given domain. When exposed to only a few expert demonstrations optimizing a novel goal, the agent uses its basis to quickly and effectively infer the reward function. Our experiments show that our approach is highly effective at inferring and optimizing the demonstrated reward functions, accurately inferring reward functions from fewer than 100 trajectories.
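Not the BASIS algorithm itself, but a minimal sketch of the successor-feature idea it builds on: with a linear reward r(s) = φ(s)·w, the reward weights w can be recovered by least squares from discounted feature counts of a handful of demonstrations. All numbers here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
num_trajs, num_features = 20, 4
true_w = np.array([1.0, -0.5, 0.0, 2.0])          # hypothetical reward weights

# Discounted feature counts per demonstration (stand-ins for successor features).
Phi = rng.normal(size=(num_trajs, num_features))
returns = Phi @ true_w + rng.normal(scale=0.01, size=num_trajs)

# Infer the reward weights that explain the demonstrated returns.
w_hat, *_ = np.linalg.lstsq(Phi, returns, rcond=None)
print("recovered reward weights:", np.round(w_hat, 2))
```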
Training and evaluating language models increasingly requires the construction of meta-datasets: diverse collections of curated data with clear provenance. Natural language prompting has recently highlighted the benefits of meta-dataset curation by converting existing supervised datasets into a variety of novel pretraining tasks, improving zero-shot generalization. While these data-centric approaches have been successful for general-domain text, translating them to biomedical language modeling remains challenging, because labeled biomedical datasets are significantly underrepresented in popular data hubs. To address this challenge, we introduce BigBio, a community library of more than 126 biomedical NLP datasets, currently covering 12 task categories and more than 10 languages. BigBio facilitates reproducible meta-dataset curation through programmatic access to datasets and their metadata, and is compatible with current platforms for prompt engineering and end-to-end few-/zero-shot language model evaluation. We discuss our process for task-schema harmonization, data auditing, and contribution guidelines, and outline two illustrative use cases: zero-shot evaluation of biomedical prompts and large-scale multi-task learning. BigBio is an ongoing community effort and is available at https://github.com/bigscience-workshop/biomedical.
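A minimal usage sketch of programmatic access via the Hugging Face `datasets` library; the dataset id and config name below are assumptions based on BigBio's convention of hosting datasets under a `bigbio` organization with harmonized `<source>_bigbio_<schema>` configurations, so substitute a dataset of interest.

```python
from datasets import load_dataset

# Hypothetical example id/config following the assumed BigBio naming convention.
ds = load_dataset("bigbio/bc5cdr", name="bc5cdr_bigbio_kb")

print(ds)                      # available splits under the harmonized schema
print(ds["train"][0].keys())   # harmonized knowledge-base schema fields
```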
Multi-task learning is frequently used to model a set of related response variables from the same set of features, improving predictive performance and modeling accuracy relative to methods that handle each response variable separately. Despite the potential of multi-task learning to yield more powerful inference than single-task alternatives, prior work in this area has largely ignored uncertainty quantification. Our focus in this paper is a common multi-task problem in neuroimaging, where the goal is to understand the relationship between multiple cognitive task scores (or other subject-level assessments) and brain connectivity data collected from imaging. We propose a framework for selective inference to address this problem, with the flexibility to (i) jointly identify the relevant covariates for each task through a sparsity-inducing penalty, and (ii) conduct valid inference in a model based on the estimated sparsity structure. Our framework offers a new conditional procedure for inference, based on a refinement of the selection event that yields a tractable selection-adjusted likelihood. This gives an approximate system of estimating equations for maximum-likelihood inference, solvable via a single convex optimization problem, and allows us to efficiently form confidence intervals with approximately correct coverage. Applied to simulated data and to data from the Adolescent Brain Cognitive Development (ABCD) study, our selective inference methods yield tighter confidence intervals than commonly used alternatives such as data splitting. We also demonstrate through simulations that multi-task learning with selective inference can recover true signals more accurately than single-task methods.
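A minimal sketch of step (i), the joint covariate selection across tasks, using scikit-learn's MultiTaskLasso on synthetic data; the selective-inference procedure and selection-adjusted confidence intervals of step (ii) are the paper's contribution and are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n, p, k = 200, 50, 3                      # subjects, connectivity features, task scores
X = rng.normal(size=(n, p))
W = np.zeros((p, k))
W[:5] = rng.normal(size=(5, k))           # only the first 5 covariates carry signal
Y = X @ W + rng.normal(scale=0.5, size=(n, k))

# The L2,1 penalty selects one covariate set shared by all response variables,
# unlike fitting each task with its own lasso.
fit = MultiTaskLasso(alpha=0.1).fit(X, Y)
selected = np.flatnonzero(np.any(fit.coef_ != 0, axis=0))   # coef_ has shape (k, p)
print("jointly selected covariates:", selected)
```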
Stroke rehabilitation seeks to increase neuroplasticity through the repeated practice of functional motions, but it may have minimal impact on recovery when repetitions are insufficient. The optimal training content and quantity are currently unknown, because no practical tools exist to measure them. Here we present PrimSeq, a pipeline to classify and count the functional motions trained in stroke rehabilitation. Our approach integrates wearable sensors to capture upper-body motion, a deep learning model to predict motion sequences, and an algorithm to tally the motions. The trained model decomposes rehabilitation activities into component functional motions, outperforming competitive machine learning methods. PrimSeq further quantifies these motions at a fraction of the time and labor cost of human experts. We demonstrate the capabilities of PrimSeq on previously unseen stroke patients with a range of upper-limb motor impairments. We anticipate that these advances will support the rigorous measurement required for quantitative dosing trials in stroke rehabilitation.
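A minimal sketch of what the final tallying step could look like (an assumption for illustration, not the paper's algorithm): collapse a per-frame sequence of predicted functional primitives into runs and count them.

```python
from itertools import groupby
from collections import Counter

# Hypothetical per-frame labels produced by a sequence model.
predicted = ["idle", "reach", "reach", "reach", "transport", "transport",
             "idle", "reach", "reach", "stabilize", "idle"]

# Merge consecutive identical labels into one completed motion, then tally.
runs = [label for label, _ in groupby(predicted) if label != "idle"]
print(Counter(runs))   # e.g. Counter({'reach': 2, 'transport': 1, 'stabilize': 1})
```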
Norms have been widely proposed as a way of coordinating and controlling the activities of agents in a multi-agent system (MAS). A norm specifies the behaviour agents should follow in order to achieve the objective of the MAS. However, designing norms that achieve a particular system objective can be difficult, especially when there is no direct link between the language in which the system objective is stated and the language in which the norms can be expressed. In this paper, we consider the problem of synthesising a norm from traces of agent behaviour, where each trace is labelled with whether that behaviour satisfies the system objective. We show that this norm synthesis problem is NP-complete.
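A minimal brute-force sketch of the trace-based synthesis problem under a deliberately simplified norm language (a norm is just a set of prohibited actions; the traces and labels are invented): find the smallest prohibition set that every objective-satisfying trace obeys and every objective-violating trace breaks. The exhaustive search over candidate norms reflects the hardness suggested by the NP-completeness result.

```python
from itertools import chain, combinations

good = [["move", "pick", "drop"], ["move", "move", "pick"]]   # labelled: objective satisfied
bad  = [["move", "steal", "drop"], ["idle", "idle", "idle"]]  # labelled: objective not satisfied

def synthesize(good, bad):
    actions = sorted(set(chain.from_iterable(good + bad)))
    for r in range(1, len(actions) + 1):                      # smallest candidate norms first
        for candidate in combinations(actions, r):
            obeys = lambda trace: not any(a in candidate for a in trace)
            if all(obeys(t) for t in good) and all(not obeys(t) for t in bad):
                return set(candidate)
    return None                                               # no prohibition-only norm separates the traces

print("synthesised prohibition norm:", synthesize(good, bad))
```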